Abstract

This note provides a simple result showing, under suitable technical assumptions, that if a system
$\Sigma$ adapts to a class of external signals $\mathcal{U}$, then $\Sigma$ must necessarily contain a
subsystem which is capable of generating all the signals in $\mathcal{U}$. It is not assumed that
regulation is robust, nor is there a prior requirement for the system to be partitioned into separate
plant and controller components.



1 Introduction 

Suppose that one knows that a certain system $\Sigma$ adapts to (or "regulates against") all those
external input signals $u$ which belong to a predetermined class $\mathcal{U}$ of time-functions. (Input
signals $u$ are often thought of as disturbances to be rejected or signals to be tracked, depending on
the application.) In this context, adaptation means that a certain quantity $y(t)$ associated to the
system, called its output (also called a regulated variable or an error), has the property that
$y(t) \to 0$ as $t \to \infty$ whenever the system is subject to an input signal from the class
$\mathcal{U}$ (Figure 1). The internal model principle (IMP) states, roughly, that the system $\Sigma$
necessarily must contain a subsystem $\Sigma_{IM}$ which can itself generate all disturbances in the
class $\mathcal{U}$. The terminology arises when thinking of $\Sigma_{IM}$ as a "model" of a system
which generates the external signals.

Figure 1: Given System, Regulated Output $y(t) \to 0$ when Inputs in $\mathcal{U}$

For example, if $y(t) \to 0$ as $t \to \infty$ whenever the system is subject to any external constant
signal (i.e., the class $\mathcal{U}$ consists of all constant functions), then the system $\Sigma$ must
contain a subsystem $\Sigma_{IM}$ which generates all constant signals (typically an integrator, since
constant signals are generated by the differential equation $\dot u = 0$). Of course, the choice of
$y = 0$ as the "adaptation value" is merely a matter of convention; by means of a change of variables,
one may always reduce a given regulation objective "$y(t) \to y_0$", where $y_0$ is some predetermined
value, to the special case $y_0 = 0$.

In addition, the IMP specifies that, in an appropriate sense, the subsystem $\Sigma_{IM}$ must only
have $y$ as its external input, receiving no other direct information from other parts of the system
nor from the input signal $u$. One intuitive interpretation is that $\Sigma_{IM}$ generates its "best
guess" of the external input $u$ based on how far the output $y$ is from zero. Pictorially, if we have
the situation shown in Figure 1, then there must be a decomposition of the system $\Sigma$ into two
parts, as shown in Figure 2, where the system $\Sigma_{IM}$ (with $y \equiv 0$) is capable of
reproducing all the functions in $\mathcal{U}$.

Figure 2: Decomposition of $\Sigma$ into $\Sigma_0$ and $\Sigma_{IM}$, the Latter Driven by $y(t)$



The internal model principle originates in the biological cybernetics literature. But, as with any 
"principle" in control theory (like dynamic programming, the maximum principle, etc.) and more 
generally in mathematics, the IMP is not a theorem but rather a "mold" for many possible theorems, 
each of which will hold under appropriate technical assumptions, and whose conclusions will depend 
upon the precise meaning of "class of external signals", "reproducing all functions", and so on.

The best known instance of an internal model theorem is due to Francis and Wonham, who in a 
series of beautiful and deep papers in the mid 1970s proved a theorem for linear systems which showed, 
in essence, that structurally stable or "robust" adaptation implies the existence of internal models. 
Partial generalizations of their work to nonlinear systems were later obtained by Wonham and Hepburn,
see [?]. The Francis/Wonham theory applies to systems $\Sigma$ which are already partitioned into a
"plant" plus a "controller". The robustness assumption amounts to the requirement that the given
controller should perform appropriately (in the sense that the regulation objective $y(t) \to 0$ is
achieved) even when the plant subsystem - but most definitely not the controller subsystem - is
arbitrarily perturbed. The conclusion is that the controller is driven by $y$ and incorporates a model
of the external signals. That some additional condition - such as structural stability - must be
imposed is obvious, since the system $\Sigma$ which simply outputs $y \equiv 0$ for every possible
input signal $u$ does not contain any subsystem generating the signals $u$. We will impose instead a
condition which amounts to a "signal detection" property: the output must reflect sudden changes in
the input.

Recent work in molecular biology, cf. [?], has suggested that the IMP could help guide experimentalists
and modelers: if certain characteristics of a system adapt to signals in a given class (in all the
examples so far, constant inputs; for instance, $y(t)$ = the relative "activity" of enzymes controlling
motors in E. coli chemotaxis, with respect to $u(t)$ = concentration of extracellular ligand), then the
IMP could, in principle, help distinguish among mathematical models which do or do not contain
internal models.

With a view toward such biological applications, it is desirable to have available a theorem which
(a) applies to nonlinear systems $\Sigma$, at least under reasonable technical assumptions, and (b)
requires neither that the system $\Sigma$ be split between "plant" and "controller" subsystems, nor
structural stability (robustness) in the sense of the Francis/Wonham theory (which would imply, in the
case of the E. coli motor control network, that the system should perfectly adapt even if there are
arbitrary direct connections between the external ligands and the motor signals, a matter which seems
difficult to check experimentally). We present one very elementary and self-contained such result in
this note. It basically just picks from and "repackages" some of the basic concepts and techniques
developed by previous researchers for the same problems, in particular: the use of differential
geometric techniques and "output-zeroing" sets ([?]), dynamical systems notions like omega-limit sets
([?]), and system decompositions motivated by the Center Manifold Theorem ([?]). Isidori's excellent
textbooks [11, 12] should be consulted for a far deeper discussion of many of the issues raised here.

Precise mathematical definitions are provided in Section 2. On the other hand, since the linear
version of the result is very easy to explain, we sketch that case first. (The discussion assumes some
familiarity with frequency domain techniques, and may be skipped without loss of continuity.)

1.1 Linear Case

Let us denote by $S$ the transfer function of the system $\Sigma$: if $y$ is the output produced when
$\Sigma$ starts at the zero initial state and is fed input $u$, then the relation
$\hat y(s) = S(s)\hat u(s)$ holds between the Laplace transforms $\hat y(s)$, $\hat u(s)$ of the output
and input. One expresses $S(s) = p(s)/q(s)$ as the quotient of two relatively prime polynomials, with
the degree of $p$ less than the degree of $q$. (An equivalent discussion using differential operators
instead of Laplace transforms is also possible, see e.g. Section 6.7 in [?].) The first observation, a
well-known fact in systems theory, is that the zeroes of $p$ can be viewed, alternatively, as poles of
a feedback subsystem. To see this, we assume that $p$ is not identically zero, and divide the
polynomial $q$ by $p$, obtaining $q = ap + b$, where $b$ is some polynomial of degree less than that of
$p$. Now, as the algebraic equality $y = \frac{p}{q}u$ is equivalent to
$y = \frac{1}{a}\left(u - \frac{b}{p}y\right)$, we conclude that the system $\Sigma$ can be decomposed
as in Figure 3. For example, if $s = 0$ is a zero of $S$ (that is, $0$ is a root of $p$), which amounts
to the property that constant signals get differentiated by $\Sigma$ (the "DC gain" of $\Sigma$ is
zero), then the factor $1/s$ appears in the feedback box $b/p$, and can be interpreted as an integrator
of the output $y$.

Figure 3: System Equivalent to $\Sigma$: Closed-loop Zeros are Feedback Poles
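
The division step $q = ap + b$ and the resulting feedback decomposition can be checked numerically. The
following is only a minimal sketch in plain Python: the helper names `poly_divmod` and `polyval`, the
coefficient-list convention, and the example $S(s) = s/(s^2 + 3s + 2)$ are illustrative assumptions, not
taken from the text. It verifies at one test point that $p/q$ agrees with the closed loop formed by the
forward block $1/a$ and the feedback block $b/p$.

```python
# Polynomials are coefficient lists, highest degree first (an illustrative convention).

def poly_divmod(q, p):
    """Long division: return (a, b) with q = a*p + b and deg b < deg p."""
    rem = [float(c) for c in q]
    a = []
    for i in range(len(rem) - len(p) + 1):
        coeff = rem[i] / p[0]
        a.append(coeff)
        for j, pc in enumerate(p):
            rem[i + j] -= coeff * pc
    b = rem[len(rem) - len(p) + 1:]  # the last deg(p) coefficients form the remainder
    return a, b

def polyval(coeffs, s):
    """Evaluate a polynomial at s by Horner's rule."""
    acc = 0.0
    for c in coeffs:
        acc = acc * s + c
    return acc

# Hypothetical example: S(s) = s / (s^2 + 3s + 2), which has a zero at s = 0.
p = [1.0, 0.0]
q = [1.0, 3.0, 2.0]
a, b = poly_divmod(q, p)  # a(s) = s + 3, b(s) = 2, so the feedback block is b/p = 2/s

s0 = 0.7 + 1.3j  # arbitrary test point away from poles and zeros
direct = polyval(p, s0) / polyval(q, s0)       # S(s0) = p/q
forward = 1.0 / polyval(a, s0)                 # forward block 1/a
feedback = polyval(b, s0) / polyval(p, s0)     # feedback block b/p
loop = forward / (1.0 + forward * feedback)    # y = (1/a)(u - (b/p) y) solved for y/u
```

In this example the feedback block $b/p = 2/s$ is an integrator with gain 2, matching the remark about
a zero at $s = 0$.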

We will show that, in general, the subsystem with transfer function $b/p$ models all inputs which
$\Sigma$ adapts to. Let us suppose that the class $\mathcal{U}$ of inputs can be described as the set of
all possible solutions of a fixed linear differential equation

$$u^{(\ell)}(t) + b_1 u^{(\ell-1)}(t) + \ldots + b_{\ell-1} u'(t) + b_\ell u(t) = 0$$

for some integer $\ell$, and which has no stable modes. (Stable modes, giving components of $u$ which
converge to zero, are less interesting, since they do not represent persistent disturbances.) We view
these signals $u$ as the outputs of an "exosystem" $\Gamma$ which is obtained by rewriting the
differential equation as a system of $\ell$ first order equations. Figure 4 shows a cascade consisting
of the original system $\Sigma$ and the exosystem $\Gamma$ which generates the inputs in $\mathcal{U}$.
(If, for example, $\mathcal{U}$ = constant inputs, then one would let $\Gamma$ be the system with
equation $\dot w = 0$ and output $u = w$, and for each initial condition $w(0) = w^0$ one obtains a
different constant output $u(t) \equiv w^0$.) The regulation objective is now simply that
$y(t) \to 0$ for all possible initial conditions of the composite system, i.e. for all initial
conditions of the original system $\Sigma$ and all initial conditions of the exosystem $\Gamma$, the
latter corresponding to all possible inputs in $\mathcal{U}$ being fed to $\Sigma$.

Figure 4: Exosystem and System in Cascade

Next, we reformulate this regulation property by adding an external input $v(\cdot)$ to the exosystem,
and requiring now that $y(t) \to 0$ for all possible stable inputs ($v(t) \to 0$ as $t \to \infty$) but
only when starting from the zero initial state. (Such replacements of initial states by stable forcing
inputs - assuming natural controllability/observability conditions - are elementary exercises in
linear systems theory, see e.g. the proof of Theorem 33 in [?].) In other words, we have now the
situation illustrated by Figure 5.

Figure 5: Exosystem and System, Forced by Stable Inputs

We denote by $G$ the transfer function of the exosystem $\Gamma$: $G = 1/\pi$, where

$$\pi(s) = s^\ell + b_1 s^{\ell-1} + \ldots + b_{\ell-1} s + b_\ell .$$

To see that the subsystem with transfer function $b/p$ includes an internal model of $\Gamma$, we argue
as follows. The regulation property for the cascade in Figure 5 means that the product rational
function $GS$ is stable (all poles have negative real parts), while the assumption that $G$ had no
stable modes means that all the poles of $G$ (i.e., the roots of the polynomial $\pi$) have nonnegative
real parts. Therefore, these poles must be canceled in the product $GS$; in other words, $S$ must have
among its zeroes all the poles of $G$, so that we can write $p = \pi p_0$ for some polynomial $p_0$.
Thus $b/p = b/(\pi p_0)$. One may now factor $b = b_1 b_2$ in such a way that the degree of $b_2$ is
less than the degree of $\pi$, so that $b/p = (b_1/p_0)(b_2/\pi)$, and now the system with transfer
function $b/p$ can be written itself in the cascade form in Figure 6. The subsystem with transfer
function $b_2/\pi$ generates all the inputs in $\mathcal{U}$, since one may write a set of differential
equations for it which is exactly the same as for the exosystem $\Gamma$, changing only the output
mapping ("controller form" realization).

Figure 6: Decomposition of $b/p$

Since the tools of transfer functions are not available for nonlinear systems, a different approach is 
required in general. 



2 Definitions and Statement of Result 

We consider single-input single-output systems $\Sigma$, affine in inputs:

$$\dot x(t) = f(x(t)) + u(t)\, g(x(t)), \quad y(t) = h(x(t)) \tag{1}$$

(dot indicates derivative with respect to time, and the arguments $t$ will be omitted from now on; see
[?] for general definitions and properties of systems with inputs). Here $x(t)$, $u(t)$, and $y(t)$
represent the state, input, and output at time $t$, $f$ and $g$ are smooth vector fields on
$\mathbb{R}^n$ ($n$ is the dimension of the system), $h$ is a scalar smooth function
$\mathbb{R}^n \to \mathbb{R}$, and $f(0) = h(0) = 0$. (Several assumptions on $f$ and $g$ will be made
later.) A special case is that of linear systems

$$\dot x = Ax + ub, \quad y = cx \tag{2}$$

where $A$ is an $n \times n$ matrix, $b$ is a column $n$-vector, and $c$ is a row $n$-vector.

Suppose given a class $\mathcal{U}$ of functions $[0, \infty) \to \mathbb{R}$ (such as for example the
set of all constant functions). We say that $\Sigma$ adapts to inputs in $\mathcal{U}$ (a more
appropriate technical control-theoretic term would be "asymptotically rejects disturbances in
$\mathcal{U}$") if the following property holds: for each $u \in \mathcal{U}$ and each initial state
$x^0 \in \mathbb{R}^n$, the solution of (1) with initial condition $x(0) = x^0$ exists for all
$t \geq 0$ and is bounded, and the corresponding output $y(t) = h(x(t))$ converges to zero as
$t \to \infty$.

We will say that $\Sigma$ contains an output-driven internal model of $\mathcal{U}$ if there is a
change of coordinates which brings the equations (1) into the following block form:

$$\begin{aligned}
\dot z_1 &= f_1(z_1, z_2) + u\, g_1(z_1, z_2) \\
\dot z_2 &= f_2(y, z_2) \\
y &= \kappa(z_1)
\end{aligned} \tag{3}$$

(the subsystems with variables $z_1$ and $z_2$ correspond respectively to $\Sigma_0$ and $\Sigma_{IM}$
in Figure 2) and in addition the subsystem with state variables $z_2$ is capable of generating all
functions in $\mathcal{U}$, meaning the following property: there is some scalar function
$\varphi(z_2)$ so that, for each possible $u \in \mathcal{U}$, there is some solution of

$$\dot z_2 = f_2(0, z_2) \tag{4}$$

which satisfies $\varphi(z_2(t)) \equiv u(t)$.

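The precise meaning of "change of coordinates" is as follows. There must exist an integer $r \leq n$,
differentiable manifolds $Z_1$ and $Z_2$ of dimensions $r$ and $n - r$ respectively, a smooth function
$\kappa : Z_1 \to \mathbb{R}$, vector fields $F$ and $G$ on $Z_1 \times Z_2$ which take the partitioned
form

$$F = \begin{pmatrix} f_1(z_1, z_2) \\ f_2(\kappa(z_1), z_2) \end{pmatrix}, \quad
G = \begin{pmatrix} g_1(z_1, z_2) \\ 0 \end{pmatrix}$$

and a diffeomorphism $\Phi : \mathbb{R}^n \to Z_1 \times Z_2$, such that

$$\Phi_*(x) f(x) = F(\Phi(x)), \quad \Phi_*(x) g(x) = G(\Phi(x)), \quad \kappa(\Phi_1(x)) = h(x)$$

for all $x \in \mathbb{R}^n$, where $\Phi_1$ is the $Z_1$-component of $\Phi$ and star indicates
Jacobian.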

Our result will hold under additional conditions on the vector fields defining the system. The first 
condition is the fundamental one from an intuitive point of view, namely that the system is able to 
detect changes in the input signal: 



Assumption 1: a uniform relative degree exists. 



This means that there exists some positive integer $r$ such that

$$L_g L_f^k h \equiv 0 \quad \forall\, k < r - 1$$

and

$$L_g L_f^{r-1} h(x) \neq 0 \quad \forall\, x \in \mathbb{R}^n$$

where, as usual, $L_X h$ indicates the directional derivative ("Lie derivative") of a function $h$
along the direction of the vector field $X$, that is $(L_X h)(x) = \nabla h(x) \cdot X(x)$. The integer
$r$, if it exists, is called the relative degree of $\Sigma$. It is possible to prove (see [11]) that
when $r$ exists, necessarily $r \leq n$.

For a linear system (2), existence of a relative degree amounts to simply asking that $cA^i b$ is
nonzero for some $i$, or equivalently that the transfer function $c(sI - A)^{-1}b$ is not identically
zero. For general systems (1), the assumption is equivalent to the statement that the output
derivatives $y^{(k)}(t)$ must be independent of the value of the input at time $t$, for all $k < r$,
but that $y^{(r)}(t) = b(x(t)) + a(x(t))u(t)$ for some functions $a(x)$ and $b(x)$, with $a(x)$
everywhere nonzero (so that the system can be "inverted" to obtain the instantaneous value $u(t)$ from
instantaneous derivatives). See also [?] for a discussion of the characterization of $r$ in terms of
smoothness of outputs when inputs are discontinuous (change detection).
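
For the linear case (2), the relative degree is the smallest $r$ with $cA^{r-1}b \neq 0$, since there
$L_g L_f^k h = cA^k b$. The following plain-Python sketch illustrates this; the function names, the
tolerance, and the example matrices are illustrative assumptions, not part of the original text.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in A]

def linear_relative_degree(A, b, c, tol=1e-12):
    """Return the smallest r >= 1 with c A^(r-1) b != 0, or None if no such r <= n exists.

    By the Cayley-Hamilton theorem, if c A^i b = 0 for i = 0, ..., n-1, then it vanishes
    for every i, so checking n terms is enough.
    """
    v = list(b)  # holds A^(r-1) b at the start of iteration r
    for r in range(1, len(A) + 1):
        if abs(sum(ci * vi for ci, vi in zip(c, v))) > tol:
            return r
        v = mat_vec(A, v)
    return None

# Hypothetical example: a double integrator with position output.
A = [[0.0, 1.0], [0.0, 0.0]]
b = [0.0, 1.0]
c = [1.0, 0.0]
r_double_integrator = linear_relative_degree(A, b, c)  # c b = 0 and c A b = 1

# Same dynamics, but the output also reads the velocity, so c b != 0.
r_velocity_output = linear_relative_degree(A, b, [1.0, 1.0])

# Identically zero output: the transfer function vanishes and no relative degree exists.
r_zero_output = linear_relative_degree(A, b, [0.0, 0.0])
```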

The next two conditions are of a technical nature. They are automatically satisfied for linear systems.
For nonlinear systems, we need such conditions in order to guarantee the existence of a change of
variables exhibiting the system $\Sigma_{IM}$. (Weaker conditions may be given, if one is merely
interested in a local result, or if one is willing to accept a subsystem $\Sigma_{IM}$ which is driven
by not just $y$ but also several derivatives of $y$.) We are guided by conditions which appear in
Isidori's [?].

Assuming that the relative degree is $r$, we introduce the following vector fields:

$$\tilde g(x) = \frac{1}{L_g L_f^{r-1} h(x)}\, g(x), \quad
\tilde f(x) = f(x) - \left(L_f^r h(x)\right) \tilde g(x), \quad
\tau_i := \mathrm{ad}_{\tilde f}^{\,i-1}\, \tilde g, \quad i = 1, \ldots, r,$$

where $\mathrm{ad}_X$ is the operator $\mathrm{ad}_X Y = [X, Y]$ = Lie bracket of the vector fields $X$
and $Y$. Recall that a vector field $X$ is said to be complete if the solution of the initial value
problem $\dot x = X(x)$, $x(0) = x^0$ is defined for all $t \in \mathbb{R}$, for any initial state
$x^0$, and that two vector fields $X$ and $Y$ are said to commute if $[X, Y] = 0$. The assumptions are:



Assumption 2: $\tau_i$ is complete, for $i = 1, \ldots, r$.



Assumption 3: the vector fields $\tau_i$ commute with each other.



(For linear systems, the vector fields $\tau_i$ are all constant, so that they are indeed complete and
pairwise commutative.)

Finally, we must define the allowed classes of inputs $\mathcal{U}$. As usual in control theory (see
also the discussion in Section 1.1), we will assume that inputs are generated by exosystems. That is,
there is given a system $\Gamma$:

$$\dot w = Q(w), \quad u = \theta(w) \tag{5}$$

(let us say evolving on some differentiable manifold, $Q$ a smooth vector field, and $\theta$ a
real-valued smooth function, although far less than smoothness is needed) such that the input class
$\mathcal{U}$ consists exactly of those inputs $u(t) = \theta(w(t))$, $t \geq 0$, for all possible
solutions of $\dot w = Q(w)$. For example, if we are interested in constant signals, we pick
$\dot w = 0$, $u = w$, and if we are interested in sinusoidals with frequency $\omega$ then we use
$\dot x_1 = x_2$, $\dot x_2 = -\omega^2 x_1$, $u = x_1$. It is by now standard in nonlinear studies of
necessary conditions for regulation to impose conditions on omega-limit sets for trajectories of the
exosystem, see [?]; we will follow the approach in [?] and assume that the exosystem is
Poisson-stable: for every state $w^0$, the solution $w(\cdot)$ of $\dot w = Q(w)$, $w(0) = w^0$ is
defined for all $t \geq 0$ and it satisfies that $w^0$ is in the omega-limit set of $w$, that is, there
is a sequence of times $t_i \to \infty$ such that the sequence $w(t_i)$ converges to $w^0$ as
$i \to \infty$. This means that the exosystem is almost-periodic in the sense that trajectories keep
returning to neighborhoods of the initial state.
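
As a small illustration of Poisson stability, the sinusoidal exosystem above returns exactly to its
initial state at the times $t_k = 2\pi k/\omega$. The sketch below uses the closed-form flow of
$\dot x_1 = x_2$, $\dot x_2 = -\omega^2 x_1$; the function names and the numerical values of $\omega$
and $w^0$ are assumptions chosen only for illustration.

```python
import math

def harmonic_flow(w0, omega, t):
    """Closed-form solution of x1' = x2, x2' = -omega**2 * x1 starting from w0, at time t."""
    x1, x2 = w0
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (x1 * c + (x2 / omega) * s, -omega * x1 * s + x2 * c)

def distance(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

omega = 3.0        # hypothetical frequency
w0 = (1.0, -2.0)   # hypothetical initial state of the exosystem

# Return times t_k = 2*pi*k/omega: the trajectory comes back to w0 (up to rounding).
return_gaps = [distance(harmonic_flow(w0, omega, 2.0 * math.pi * k / omega), w0)
               for k in range(1, 6)]

# Half a period later the state is -w0, so the trajectory does leave w0 in between.
half_period_gap = distance(harmonic_flow(w0, omega, math.pi / omega), w0)

# The generated input is u(t) = theta(w(t)) = x1(t), a sinusoid of frequency omega.
u_at_quarter_period = harmonic_flow(w0, omega, 0.5 * math.pi / omega)[0]
```

Here the sequence $t_i$ in the definition can be taken to be the return times $t_k$, so $w^0$ lies in
its own omega-limit set.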

This theorem is proved in Section 3:

Theorem 1 If Assumptions 1-3 hold and the system $\Sigma$ adapts to inputs in a class $\mathcal{U}$
generated by a Poisson-stable exosystem, then $\Sigma$ contains an output-driven internal model of
$\mathcal{U}$.



2.1 An Example 

As an example, consider the model for E. coli chemotaxis adaptation to constant inputs given in [?],
Section 2.2. Letting $x_1 = R$ and $x_2 = RL$ be the concentrations of unbound and bound receptors
respectively, and taking the external ligand concentration $u = L$ as input, we have the following
equations:

$$\begin{aligned}
\dot x_1 &= a_1 - a_2 x_1 + a_3 x_2 - a_4 x_1 u \\
\dot x_2 &= a_5 - a_6 x_2 + a_4 x_1 u
\end{aligned} \tag{6}$$

for suitable positive constants $a_1, \ldots, a_6$. In terms of vector fields,

$$f = \begin{pmatrix} a_1 - a_2 x_1 + a_3 x_2 \\ a_5 - a_6 x_2 \end{pmatrix}, \quad
g = \begin{pmatrix} -a_4 x_1 \\ a_4 x_1 \end{pmatrix}$$

and, still as in [?], we take as output $y$ the difference between the total concentration of active
receptors and a steady state level of this activity. In terms of the notations used here, and up to
multiplication by a suitable constant, this amounts to the following choice:

$$h(x) = A_0 - A = [a_1 + a_5] - [a_2 x_1 + (a_6 - a_3) x_2] .$$

We note that $L_g h = D x_1$, where $D = a_2 a_4 + (a_3 - a_6) a_4$. Except in the accidental case when
this constant $D$ vanishes (in terms of the notations in [?], D — k^ilTkr{ai — 02), so D can only
vanish if ai = a2), we have that $L_g h(x) \neq 0$ for all $x$ ($x_1 > 0$, as it represents a
concentration), and so it follows that $\Sigma$ has well-defined relative degree $r = 1$. Moreover,
$\tau_1 = \tilde g$ is a constant vector, so Assumptions 2 and 3 hold as well.

A minor technicality concerns the assumptions that our systems (1) evolve in all of Euclidean state
space (not just $x_i > 0$) and that $f(0) = h(0) = 0$. However, this is just a matter of picking the
right coordinates. Notice that $f$ vanishes at $x^0 = (x_1^0, x_2^0)$, where
$x_1^0 = (a_1 a_6 + a_3 a_5)/(a_2 a_6)$ and $x_2^0 = a_5/a_6$, and $h$ vanishes at $x^0$ too. In order
to fit into the general theory, one simply changes variables, mapping the positive orthant into all of
$\mathbb{R}^2$ and $x^0$ into the origin by means of $x_i \mapsto \ln x_i - \ln x_i^0$. (Of course,
there is no need to actually perform the coordinate change, since conditions expressed in terms of Lie
derivatives are covariant.)

Finally, letting $B := x_1 + x_2$ (as done in [?]), one obtains a system of equations in terms of the
new variables $A$ and $B$, for which $\dot B = y$. This last equation represents an integrator
(internal model of a system which produces constant inputs) driven by the output $y$. (Of course, there
is no point in applying the theorem, since once the model is given we can find the internal model
explicitly.)
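
The identities used in this example can be sanity-checked numerically. The sketch below is only an
illustration: the rate constants in `K`, the test point, the constant input levels, and the helper
names are hypothetical choices, not values from the cited model. It checks that $L_g h = D x_1$, that
$\dot B = \dot x_1 + \dot x_2 = h = y$ (the integrator driven by the output), and that $y(t)$ becomes
negligible under several constant inputs for these particular constants.

```python
# Hypothetical rate constants, chosen only for illustration (not taken from the cited model).
K = {"a1": 1.0, "a2": 1.0, "a3": 0.5, "a4": 1.0, "a5": 1.0, "a6": 2.0}

def f(x, k=K):
    return (k["a1"] - k["a2"] * x[0] + k["a3"] * x[1], k["a5"] - k["a6"] * x[1])

def g(x, k=K):
    return (-k["a4"] * x[0], k["a4"] * x[0])

def h(x, k=K):
    return (k["a1"] + k["a5"]) - (k["a2"] * x[0] + (k["a6"] - k["a3"]) * x[1])

def lie_g_h(x, k=K):
    """L_g h = grad h . g; h is affine, so its gradient is constant."""
    grad_h = (-k["a2"], -(k["a6"] - k["a3"]))
    gx = g(x, k)
    return grad_h[0] * gx[0] + grad_h[1] * gx[1]

def rhs(x, u, k=K):
    """Right-hand side f(x) + u g(x) of the model equations."""
    fx, gx = f(x, k), g(x, k)
    return (fx[0] + u * gx[0], fx[1] + u * gx[1])

def simulate(x0, u, T=20.0, dt=0.01, k=K):
    """Integrate with the classical Runge-Kutta scheme under a constant input u; return x(T)."""
    x = tuple(x0)
    for _ in range(int(round(T / dt))):
        k1 = rhs(x, u, k)
        k2 = rhs((x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]), u, k)
        k3 = rhs((x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]), u, k)
        k4 = rhs((x[0] + dt * k3[0], x[1] + dt * k3[1]), u, k)
        x = (x[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             x[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return x

D = K["a2"] * K["a4"] + (K["a3"] - K["a6"]) * K["a4"]
x_test = (2.0, 1.0)

lie_gap = abs(lie_g_h(x_test) - D * x_test[0])     # L_g h = D x_1
b_dot_gap = abs(sum(f(x_test)) - h(x_test))        # B' = x1' + x2' = h, since g sums to zero
final_outputs = [h(simulate((1.0, 1.0), u)) for u in (0.5, 2.0, 5.0)]
```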



3 Proof of Theorem 1



Suppose that the system $\Sigma$ adapts to inputs in $\mathcal{U}$, which are produced by a Poisson
stable exosystem $\Gamma$. We consider the interconnected system consisting of the cascade of $\Gamma$
and $\Sigma$, as shown in Figure 4, namely:

$$\begin{aligned}
\dot w &= Q(w) \\
\dot x &= f(x) + \theta(w)\, g(x)
\end{aligned} \tag{7}$$

and let $Z$ denote the set consisting of those states $x$ of $\Sigma$ for which $h(x) = 0$ (the
"output-zeroing" subset).



Lemma 3.1 For each $w^0$ there is some solution $\sigma = (w(\cdot), x(\cdot))$ of the composite system
(7) such that $w(0) = w^0$ and $x(t) \in Z$ for all $t \geq 0$.

Proof. We start by picking an arbitrary solution $\sigma_0 = (w(\cdot), x(\cdot))$ of the composite
system (7) such that $w(0) = w^0$, and let $\Omega = \Omega^+[\sigma_0]$ be the omega-limit set of this
trajectory. We claim that, for each point $(\omega, \xi) \in \Omega$ (we partition coordinates into
those for $\Gamma$ and $\Sigma$) it must be the case that $\xi \in Z$. Indeed, by definition of
$\Omega$ there is some sequence of times $t_i \to \infty$ such that $x(t_i) \to \xi$. Since
$h(x(t_i)) \to 0$ because of the adaptation property and $h$ is continuous, it follows that
$h(\xi) = 0$, as claimed. Next, we claim that there is some $x^0$ such that $(w^0, x^0) \in \Omega$. To
see this, we first pick a sequence of times $t_i \to \infty$ such that $w(t_i) \to w^0$ (Poisson
stability is used here); as $\{x(t_i)\}$ is bounded, we may pick a subsequence $t_{i_j}$ of the $t_i$
so that $x(t_{i_j}) \to x^0$ for some $x^0$, and this proves that $(w^0, x^0) \in \Omega$, as wanted.

Finally, we let $\sigma$ be the solution $\sigma = (w(\cdot), x(\cdot))$ of the composite system (7)
for which $w(0) = w^0$ and $x(0) = x^0$, where $x^0$ is so that $(w^0, x^0) \in \Omega$. Omega-limit
sets are invariant, so $\sigma(t) \in \Omega$ for all $t \geq 0$, and we already proved that this last
property implies that $x(t) \in Z$. ∎

Proposition 9.1.1 in [11] shows that there is a global diffeomorphism $\Phi$ so that, in the new
coordinates, the system $\Sigma$ takes the form shown in Display (3). Moreover, the subsystem described
by $z_1$ evolves in $\mathbb{R}^r$ and, using coordinates $z_1 = (\zeta_1, \ldots, \zeta_r)$, the
equations for $z_1$ can be written as follows:

$$\begin{aligned}
\dot\zeta_1 &= \zeta_2 \\
&\;\;\vdots \\
\dot\zeta_{r-1} &= \zeta_r \\
\dot\zeta_r &= b(z_1, z_2) + a(z_1, z_2)\, u
\end{aligned} \tag{8}$$

where the output is $y = \kappa(z_1) = \zeta_1$ and $a$, $b$ are smooth functions with
$a(z) = L_g L_f^{r-1} h(\Phi^{-1}(z)) \neq 0$ for all $z$. We let

$$\varphi(z_2) := -\frac{b(0, z_2)}{a(0, z_2)}$$

and show that for each possible $u \in \mathcal{U}$ there is some solution of (4) which satisfies
$\varphi(z_2(t)) \equiv u(t)$.

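Given $u \in \mathcal{U}$, we pick $w^0$ such that $u(t) = \theta(w(t))$ and $w(0) = w^0$, and view the
interconnection of $\Gamma$ and $\Sigma$ in terms of the coordinate change given by $\Phi$ on $\Sigma$:

$$\begin{aligned}
\dot w &= Q(w) \\
\dot z_1 &= f_1(z_1, z_2) + \theta(w)\, g_1(z_1, z_2) \\
\dot z_2 &= f_2(y, z_2) .
\end{aligned}$$

Lemma 3.1 gives us the existence of a solution $\sigma = (w(\cdot), z_1(\cdot), z_2(\cdot))$ such that
$\theta(w(t)) \equiv u(t)$ and $\zeta_1(t) \equiv 0$. Because of the form (8) of the $z_1$-subsystem,
this implies that $z_1(t) \equiv 0$ and that $\dot\zeta_r \equiv 0$. Since $y \equiv 0$ along $\sigma$,
the component $z_2(\cdot)$ is a solution of (4). Thus, along the solution $\sigma$ one has
$b(0, z_2(t)) + a(0, z_2(t))u(t) \equiv 0$, that is, $\varphi(z_2(t)) \equiv u(t)$, and this is
precisely what we wished to prove. ∎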



A Remark on Subsystems 



We expressed our theorem in terms of the existence of solutions which reproduce all inputs. Under 
additional and stronger hypotheses, one could also obtain an actual embedding of the exosystem in the 
internal model $\Sigma_{IM}$. A full nonlinear version would involve abstract quotients of systems
under suitable equivalence relations, and may follow along the lines of the work in [?] (based on [?]).
However, the necessary steps are easy to understand and prove in the case of linear systems. We start
by showing the following elementary fact from linear systems theory:



Lemma 3.2 Suppose given an observable linear system $\dot w = Qw$, $y = \theta w$ and another linear
system $\dot z_2 = F z_2 + G y$, $u = \varphi z_2$, and assume that for each $w^0$ there is some $z^0$
such that $\varphi e^{tF} z^0 = \theta e^{tQ} w^0$ for all $t \geq 0$. Then, the matrix $F$ is similar
to a matrix with this block structure:

$$\begin{pmatrix} Q & 0 & 0 \\ * & * & 0 \\ * & * & * \end{pmatrix} . \tag{9}$$



Proof. We first assume that the pair $(F, \varphi)$ is observable, and claim that for each $w^0$ there
is a unique $z^0$ such that $\varphi e^{tF} z^0 \equiv \theta e^{tQ} w^0$. This is because
$\varphi e^{tF} z^0 \equiv \varphi e^{tF} z^1$ implies $z^0 = z^1$ (observability). So we can define a
map $T : w^0 \mapsto z^0$. This map is one-to-one, by observability of the pair $(Q, \theta)$. It is
also linear, since
$\theta e^{tQ}(a w^0 + w^1) = a\,\theta e^{tQ} w^0 + \theta e^{tQ} w^1
= a\,\varphi e^{tF} T w^0 + \varphi e^{tF} T w^1 = \varphi e^{tF}(a T w^0 + T w^1)$
means that $a w^0 + w^1 \mapsto a T w^0 + T w^1$. It also satisfies $FT = TQ$, since taking derivatives
in $\varphi e^{tF} T w^0 = \theta e^{tQ} w^0$ gives $\varphi e^{tF} F T w^0 = \theta e^{tQ} Q w^0$,
which means that $Q w^0 \mapsto F T w^0$. Thus, on some invariant subspace (the range of $T$), $F$ can
be written as $Q$, which means that we can write $F$ up to similarity in the form
$\begin{pmatrix} Q & * \\ 0 & * \end{pmatrix}$. Since $F$ is similar to its transpose, and $Q$ is
similar to its transpose, $F$ is also similar to a matrix in the form
$\begin{pmatrix} Q & 0 \\ * & * \end{pmatrix}$. An observability decomposition ([?], Chapter 6) then
reduces to the observable case. ∎
Without loss of generality, one may assume that linear exosystems are observable (there always exists
an observable equivalent). We now apply Lemma 3.2 to the exosystem and the internal model
$\Sigma_{IM}$, assumed linear. There results a change of variables for $\Sigma_{IM}$ so that, in the
new variables, a subset $\zeta$ of the variables $z_2$ of $\Sigma_{IM}$, corresponding to the first
block in (9), evolves according to an equation of the form $\dot\zeta = Q\zeta + by$, for a suitable
vector $b$. This provides the desired embedding of the exosystem in the internal model.